Device for electronically calculating a Fourier transform and method of minimizing the size of internal data paths within such a device

ABSTRACT

In order to minimize the size of internal data paths within a device with a series or pipelined architecture for calculating a Fourier transform of a predetermined initial size, a sequence of Fourier transform elementary processing operations of predetermined elementary sizes smaller than the initial size are performed on data blocks with successively reduced sizes from one elementary processing operation to the next. A global dynamic value is determined for each data block derived from a current elementary processing operation, based on dynamic values of all of the data of the block. The block data are then reframed, taking into account the global dynamic value, before full subsequent elementary processing on said data is carried out.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention concerns Fourier transform calculation devices having a so-called serial or "pipeline" architecture, and their mode of operation.

2. Description of the Related Art

The literature describes many implementations of Fourier transforms that are either dedicated or programmed on signal processing microprocessors. Most of these implementations use a variant of the Cooley-Tukey algorithm, which is well known to the person skilled in the art, and which reduces the number of arithmetic operations required to calculate the Fourier transform. Among other things, this algorithm simplifies the calculation of a fast Fourier transform of size r^(p), where r represents the "root", as it is usually called by the person skilled in the art, by breaking the calculation down into the calculation of r Fourier transforms of size r^(p-1) with further complex multiplications and additions. By applying this simplification iteratively, the calculation is reduced to Fourier transforms of size r, which are easy to calculate, especially if r is chosen equal to 2 or 4, together with intermediate additions and multiplications.
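The decomposition described above can be illustrated in software. The following is a minimal sketch, not the hardware of the invention: the function names `dft` and `fft_radix` are hypothetical, and the recursion simply splits a transform of size r^(p) into r transforms of size r^(p-1) that are then combined with complex multiplications and additions.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform, used here as a reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

def fft_radix(x, r=2):
    """Recursive Cooley-Tukey transform for len(x) == r**p (illustrative only)."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % r:
        raise ValueError("length must be a power of the root r")
    # r sub-transforms of size n/r, one per residue class of the input index
    subs = [fft_radix(x[q::r], r) for q in range(r)]
    m_sub = n // r
    out = []
    for m in range(n):
        # recombination: the further complex multiplications and additions
        out.append(sum(subs[q][m % m_sub] * cmath.exp(-2j * cmath.pi * q * m / n)
                       for q in range(r)))
    return out
```

With a 16-point input, `fft_radix(x, 2)` and `fft_radix(x, 4)` should both agree with `dft(x)` to within rounding error.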

The Cooley-Tukey algorithm uses a calculation graph with a butterfly shape, well known to the person skilled in the art.

Various hardware architectures can be used to implement this butterfly calculation structure.

A first solution uses a respective hardware operator capable of carrying out a butterfly type calculation for each butterfly of the graph. This solution is feasible only for implementation of small Fourier transforms, however.

A second solution uses a single butterfly hardware operator which carries out in succession the calculations corresponding to all the butterflies of all the stages of the graph. A solution of this kind has the drawback of needing a very fast hardware operator and an input memory separate from the memory into which intermediate calculation results are written. This is to avoid access conflict when a data block enters the operator while the preceding block is still being processed. It is therefore necessary to provide two memories each having a capacity of N complex words where N denotes the size of the Fourier transform, with the result that the circuit as a whole has a large surface area, especially if N is large.

An intermediate solution is to use a butterfly type hardware operator for each stage of the graph and a memory element the function of which is to present the data to the input of the operator in the correct order, given the butterflies of the graph of the stage in question.

Architectures of this kind are called series or pipeline architectures by the person skilled in the art.

A pipeline architecture circuit for calculating a Fourier transform of predetermined initial size comprises a plurality of successive processing stages connected in series between the input and the output of the circuit by internal data paths. Each stage includes butterfly type processing means for processing Fourier transforms having a predetermined size smaller than the initial size using blocks of data of progressively decreasing size from one stage to the next. These transform sizes can be the same and equal to the root of the Fourier transform. The expression "uniform root Fourier transform" is then used. The transform sizes can be different from one stage to the next in the case of "mixed" root Fourier transforms.

One example of a pipeline architecture of this kind is described in the article by BI and JONES entitled "A Pipelined FFT Processor for Word-Sequential Data", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, No. 12, December 1989, p. 1982-1985.

Independently of the type of architecture used, the problem arises of the dynamics of the intermediate and output data, given the dynamic of the input data. By "dynamic" is meant in the present context the number of bits, including the sign bit, used to represent the data. Butterfly type hardware operators carry out complex multiplications and additions. It is of course unrealistic to retain all the bits of the results obtained multiplication by multiplication. It is standard practice, in particular in pipeline architectures, to work with a constant dynamic, i.e. to represent input, intermediate and output data on the same number of bits.

However, if the dynamic is constant, the dynamic value of the intermediate data cannot be known in advance. By the "dynamic value" of the data is meant in this context the range of values in which the data is situated, for example between -0.5 and +0.5, or between -0.05 and +0.05, and so on.

A first solution is to extend the dynamic of the data globally a priori, i.e. to estimate a priori the dynamic needed for the output data of the circuit, so as not to lose too much accuracy on the significant bits, assuming that no saturation occurs in the internal calculations, and thereafter to increase the size of the input data words by the estimated number of additional bits.

Intermediate and output data are therefore also represented by words of this size. This increases the size of the internal data paths of the circuit, which can be large in the case of large Fourier transforms requiring several stages of processing, and this imposes a penalty in terms of an increase in the total surface area of the circuit when implemented on a silicon chip.

Another solution also extends the dynamic of the data paths a priori, but stage by stage. This solution is definitely more advantageous than the first, but it also leads to an artificial increase in the size of the internal data paths of the data circuit and therefore in its surface area.

The article by BI and JONES previously referred to does not mention any solution to this problem of the dynamic of the data.

SUMMARY OF THE INVENTION

The invention is directed to providing a more satisfactory solution to this problem.

One object of the invention is to propose a Fourier transform calculation device of the constant dynamic type in which the size of the internal data paths is not artificially increased.

The invention therefore consists in an electronic device for calculating a Fourier transform of a predetermined initial size, comprising a plurality of successive processing stages connected in series between the input and the output of the device by internal data paths and comprising respective processing means adapted to carry out Fourier transform processing of predetermined sizes smaller than the initial size on blocks of data of successively reduced size from one stage to the next. In accordance with one general feature of the invention the device comprises:

means for determining a global dynamic value for each data block supplied by the processing means of a previous processing stage from dynamic values of all the data of said block,

means for delaying supply of the data of said block to the processing means of the current stage at least until all the data of said block has been supplied by the processing means of the previous processing stage,

intermediate rejustification means for rejustifying the data of said block allowing for the corresponding global dynamic value and for supplying the rejustified data to the processing means of the current processing stage, and

means for determining final dynamic values associated with the output data and obtained from the global dynamic values successively calculated,

in such a way as to minimize the size of the internal data paths of the device.

In other words, the invention provides adaptive rejustification, i.e. rejustification that allows for a dynamic value calculated on data blocks of progressively smaller size from one stage to the next.

In one embodiment of the device of the invention the input data is received sequentially at an input frequency determined by a basic clock signal. The processing means of the tth stage are adapted to carry out Fourier transform processing of size r_(t) on successive data blocks at the frequency of the basic clock signal. The time-delay means include first selective time-delay means adapted to memorize blocks of data obtained from those supplied by the processing means of the previous stage and to supply to the processing means of the current stage at the rate of the basic clock signal and with a predetermined time-delay, for each block received, successive groups of r_(t) data words in a predetermined order. The time-delay means also include second time-delay means connected to the first and also timed by the basic clock signal. The first and second time-delay means jointly memorize all the data of each block from the processing means of the preceding stage. In other words, the combined memory capacity of the first and second time-delay means is at least equal to the number of data words of a block.

The first selective time-delay means of the tth stage advantageously have r_(t) outputs connected to the processing means of that stage and two sets of r_(t) -1 time-delay elements connected in series. The output of the final time-delay element of the first set is connected directly to one output of the r_(t) outputs of the first time-delay means, and the outputs of the time-delay elements of the second set are respectively connected to the other outputs of the first time-delay means via one input of selective switching means having two inputs. The inputs of the time-delay elements of the first set are respectively connected to the other inputs of the two-input selective switching means. The second time-delay means include a time-delay element connected to the input of the first time-delay means. All the time-delay elements advantageously have the same memory size.

The time-delay elements preferably include dynamic delay lines.

In one embodiment of the device the means for determining the global dynamic value of each block include means for determining the number of duplicated sign bits of each data word of the block, said global dynamic value associated with the block being the smallest of the numbers of duplicated sign bits.

The intermediate rejustification means advantageously include means for shifting the bits of each data word of the block towards the most significant bit, said shift means being connected to the means for determining the global dynamic value.

The time-delay means are advantageously between the processing means of each pair of consecutive stages, while the means for determining global dynamic values are connected to the output of the processing means of each stage. In one embodiment the processing means of each stage include a set of adders/subtractors followed by a multiplier, and the shift means are on the input side of the set of adders/subtractors or between said set and the multiplier. The means for determining final dynamic values can include a succession of registers of the same size timed by respective clock signals of increasing frequency from one stage to the next and respectively connected to the means for determining global dynamic values of the corresponding stage and to the preceding register via an adder.

The invention also consists in a method of minimizing the size of internal data paths of a device for calculating a Fourier transform of predetermined initial size using a series or "pipeline" architecture in which device a succession of Fourier transform processing operations of predetermined sizes smaller than the initial size are carried out on blocks of data of successively reducing size from one processing operation to the next. According to one general feature of the invention a global dynamic value is determined for each data block from a current processing operation from dynamic values of all the data of the block and the data of the block is rejustified allowing for said global dynamic value before the next processing operation is carried out on this data.

In one particular embodiment of the invention, in which the input data of the device is timed at a predetermined basic clock rate, each processing operation is timed by said basic clock and the start of the next processing operation on the data of the block from the current processing operation is delayed by at least a number of basic clock cycles equal to the number of data words of the block, after acquisition of the first data of the block from the current processing operation.

The global dynamic value of each block is advantageously determined by detecting the number of duplicated sign bits of each of the data words of the block, the global dynamic value being the smallest of the numbers of duplicated sign bits of said data.

The data of the block is preferably rejustified by shifting all the bits of each data word towards the most significant bit by a number of bits equal to the smallest of the numbers of duplicated sign bits.

In one embodiment of the method each global dynamic value of a block which is the subject of a processing operation is incremented by the global dynamic value of each block obtained from this block after said processing operation so as to obtain at the end of processing a final dynamic value for each output data word.

Each output data word can then be rejustified using the final dynamic value associated with it if data output on a predetermined number of bits is required. This rejustification can be dispensed with if a floating type representation is adopted. However, in this case, to obtain a correct result the output data value must be associated with its final dynamic value.

As each processing operation includes a combination of additions and subtractions followed by a multiplication, the data can be rejustified before carrying out the additions and subtractions or after carrying out the additions and subtractions but before carrying out the multiplication.

BRIEF DESCRIPTION OF THE DRAWINGS

Other advantages and features of the invention will emerge from a reading of the detailed description of one non-limiting embodiment of the invention shown in the appended drawings in which:

FIG. 1 shows the butterfly structure calculation graph implemented in a three-stage pipeline architecture device,

FIGS. 2 and 3 show the architecture of the device corresponding to the FIG. 1 graph in more detail,

FIG. 4 is a diagram showing the hardware architecture of a butterfly type hardware operator as used in the FIG. 1 graph,

FIGS. 5 and 6 respectively show two data words before and after intermediate rejustification,

FIG. 7 shows a timing diagram corresponding to the calculation graph from FIG. 1 and to the operation of the device from FIGS. 2, 3 and 4, and

FIG. 8 is a diagram showing a hardware implementation of a memory point of a dynamic delay line usable in a device of the invention.

DETAILED DESCRIPTION OF THE INVENTION

In the example now described, the initial size N of the Fourier transform is equal to 32 and the calculation is reduced to the calculation of three Fourier transforms of sizes r₁, r₂ and r₃ respectively equal to 4, 4 and 2. This is therefore a mixed root Fourier transform since the size r_(t) of the tth processing stage (t=1, 2 or 3) is different for the first two stages and for the third and final stage. The invention naturally applies also to uniform root Fourier transforms. Each input data word is a complex word having a real part and an imaginary part coded on n bits using 2's complement notation and justified between -1 and 1.

Referring to FIGS. 1 and 2 in particular, the first processing stage ET1 carries out size 4 Fourier transform processing on a block of 32 data words (the input data). The second processing stage ET2 carries out size r₂ =4 Fourier transform processing on four successive blocks B1, B2, B3 and B4 each of eight data words. The last stage ET3 carries out size r₃ =2 Fourier transform processing on 16 successive blocks B5 through B20 each of two data words.

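Generalizing this, in a succession of processing stages, the tth stage carries out size r_(t) Fourier transform processing on successive blocks of r_(t) N_(t) data words where N_(t) = Π_(i>t) r_(i), the product being taken over the roots of all the stages following the tth stage (N_(t) = 1 for the last stage) and Π denoting the "product" function. In the example described, N₁ = 8, N₂ = 2 and N₃ = 1, which gives blocks of 32, 8 and 2 data words respectively.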
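The block sizes quoted for the example can be checked with a short calculation. The sketch below assumes that N_(t) is the product of the roots of the stages following stage t, which is the reading that reproduces the blocks of 32, 8 and 2 data words given above; the function name `stage_blocks` is hypothetical.

```python
from math import prod

def stage_blocks(roots):
    """Return (block_size, block_count) for each stage, given the stage roots."""
    n = prod(roots)                      # initial transform size N
    result = []
    for t, r_t in enumerate(roots):
        n_t = prod(roots[t + 1:])        # assumed N_t: product of the later roots
        block_size = r_t * n_t           # data words per block at this stage
        result.append((block_size, n // block_size))
    return result

# Mixed root example of the description: N = 32 with roots (4, 4, 2)
print(stage_blocks([4, 4, 2]))           # [(32, 1), (8, 4), (2, 16)]
```

Under the same assumption, the block size r_(t) N_(t) is also the minimum number of basic clock cycles by which the start of processing of a block is delayed.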

The first processing stage ET1 carries out butterfly type processing on groups of four data words. The output (intermediate) data Xi obtained after processing by the butterflies and multiplication by a coefficient ##EQU2## equal to the complex number e^(-j2mπ/N) can be subdivided into four blocks each of eight data words. Butterfly type processing is carried out on each of these blocks on groups of four data words. After multiplication by the coefficient Wq the output data Yi can be divided into 16 blocks of two data words each of which is processed by a butterfly type operator of a size 2 Fourier transform. FIG. 1 shows this size 2 butterfly operator without the associated multiplier, because it is located at the end of the system. The structure of this operator is well known to the person skilled in the art and is shown in "Theory and Application of Digital Signal Processing" by Lawrence R. Rabiner and Bernard Gold, for example.

Similarly, the person skilled in the art will realize that the order of the data is not the same from one stage to the next. Accordingly, although the input data arrives in the order 0, 1, 2, . . . , 31, as shown in FIG. 2 in particular, in this example the output data is delivered in the order 0, 16, 4, . . . , 31.

As described in more detail later, in each stage from the second stage onwards a global dynamic value E1-E20 is determined for each data block B1-B20 from dynamic values of all the data of the block. This enables rejustification of the data of the block before processing in the butterfly operator. Of course, it may be possible (although it is not indispensable) to provide means for calculating a global dynamic value for the block of 32 input data words, since these have a common justification such that their real and imaginary parts are all between -1 and 1.

FIGS. 2 through 4 show the hardware architecture of a circuit D implementing the FIG. 1 calculation graph. This circuit is advantageously hardwired, for example integrated on a silicon chip, and, being made up of discrete elements, can be divided into a succession of processing stages ET1, ET2, ET3 connected to each other and between the data input ES and the data output OUT by n-bit internal data paths (buses). Each processing stage, for example the 2nd processing stage ET2, includes processing means EAS2, MC2 for carrying out size r_(t) Fourier transform processing on successive blocks of data of reduced size. The processing means EAS1, MC1 of the 1st stage ET1 thus carry out size 4 Fourier transform processing on successive groups of four input data words from the block of 32 input data words, in a predetermined order.

The processing means of the 2nd stage ET2 likewise carry out size 4 Fourier transform processing on successive groups of four data words from block B1, in a predetermined order and corresponding to the respective butterflies of the calculation graph, and then another size 4 Fourier transform calculation on successive groups of four data words from block B2, and so on up to the data of block B4.

The processing means of each stage, including the 2nd stage ET2, are timed by a basic clock signal H delivered by an appropriate circuit BH2. The processing means EAS2, MC2 of stage ET2 are associated with first selective time-delay means MRA2 adapted to deliver the successive groups of r_(t) data words in a predetermined order to the processing means EAS2, MC2 at the clock rate of the basic clock signal H and with a predetermined time-delay for each block received.

The first selective time-delay means MRA2 of the 2nd stage ET2 include four outputs S21, S22, S23, S24 connected to the processing means EAS2, MC2. They also include two sets of three time-delay elements ER1-2 through ER6-2 connected in series. The first set of elements includes the elements ER1-2 through ER3-2. The second set of elements includes the elements ER4-2 through ER6-2.

The output of the last time-delay element ER3-2 of the first set is connected directly to the output S21. The outputs of the time-delay elements ER4-2 through ER6-2 of the second set are respectively connected to the other outputs S22, S23, S24 by means of one input of the two-input selective switching means Ca, Cb, Cc.

The inputs of the time-delay elements of the first set are respectively connected to the other inputs of the switching means Ca, Cb, Cc.

The switching means Ca, Cb, Cc are controlled by control signals from control logic LC2. One way of controlling these switches is described in the previously mentioned article by BI and JONES, the content of which is hereby incorporated by way of reference.

The "size" of each of the time-delay elements, i.e. the number of data words that they can each store temporarily, is equal to two for the 2nd stage ET2. More generally, in a succession of processing stages, the size of each of the time-delay elements is equal to ##EQU3## where t is the stage number.

Of course, the person skilled in the art will realize that although the word "size" is used for simplicity in respect of the number of data words to be stored, the storage capacity of each time-delay element is in fact greater than this since each data "word" is made up of two words respectively representing its imaginary part and its real part.

Second time-delay means MRB2 include a time-delay element of the same size as the time-delay elements of the first time-delay means MRA2 and the output of which is connected to the input EN2 of the first time-delay means MRA2 (and therefore in this example to the input of the first time-delay element ER1-2 of the first set).

All these time-delay elements have sequential access memory means timed by the basic clock signal H. They may be implemented as shift registers, for example, or using first in/first out (FIFO) memories. It is particularly advantageous, for reasons concerning their overall size, to use dynamic delay lines whose various memory points comprise three transistors as shown in FIG. 8. The gates of the two transistors T1 and T2 are respectively controlled by write and read signals. They are respectively connected between a write bus BEC and a read bus BLE and are also connected to ground via a third transistor T3. The value stored is held in the transistor T3.

The processing means of each stage, such as those of the stage ET2 shown diagrammatically in FIG. 4, include a set EAS2 of complex adders/subtractors (there are three of these devices in this example: AS1, AS2, AS3), followed by a complex multiplier MC2. The processing means have r_(t) inputs (four inputs in this example) connected to the two adders/subtractors AS1, AS2 by a multiplexer MUX controlled by the control logic LC2. A more complete implementation of this type of operator is described in the previously mentioned article by BI and JONES. On simultaneously receiving four input data words a0, a1, a2 and a3 the processing means deliver four successive data output words b0, b1, b2, b3 corresponding to the Fourier transform of the input data.
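The relationship between the four inputs a0-a3 and the four outputs b0-b3 can be sketched as follows. This is only a numerical illustration of a size 4 Fourier transform built from additions and subtractions; it does not model the multiplexer, the scheduling of the three adders/subtractors AS1-AS3 or the complex multiplier MC2, and the function name `butterfly4` is hypothetical.

```python
def butterfly4(a0, a1, a2, a3):
    """Size 4 Fourier transform using additions, subtractions and a -j rotation."""
    s02, d02 = a0 + a2, a0 - a2
    s13, d13 = a1 + a3, a1 - a3
    # multiplication by -j is only a swap of real and imaginary parts with a sign
    # change, so it needs no true multiplier
    jd13 = complex(d13.imag, -d13.real)
    b0 = s02 + s13
    b1 = d02 + jd13
    b2 = s02 - s13
    b3 = d02 - jd13
    return [b0, b1, b2, b3]
```

For instance, `butterfly4(1, 0, 0, 0)` returns four values equal to 1, as expected for the transform of a unit impulse.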

Means CD1 for justifying (left-justifying, for example) on n bits data Xi from the processing means of the stage ET1 are provided between the output of the multiplier MC1 of the processing means of the stage ET1 and the input of the stage ET2 (i.e. in this example the input of the second time-delay means MRB2).

The output of the means CD1 of the stage ET1 is connected to means DBS2 capable of determining a global dynamic value for each block of data from the multiplier MC1 from dynamic values of all the data of the block.

In concrete terms, if each data word is coded in 2's complement binary notation on a predetermined number of bits, detection of the dynamic value of each data word is based on detection of the number of duplicated sign bits in that data word. The means DBS2 then include means for comparing the value of the most significant bit of the data word with a certain number of immediately adjacent bits, for example three such bits. The number of adjacent bits equal to the sign bit determines the number of sign bits duplicated. In the example shown in FIG. 5 in which S denotes the sign bit and BT1 through BT6 denote the significant bits, the three bits on the left are identical, which corresponds to two duplicated sign bits.

The means DBS2 also include means for determining the smallest number of duplicated sign bits for all the data of the block in question. This smallest number, which represents the global dynamic value of the block, is then stored in a register RGb2 controlled by a clock signal H2 derived from the basic clock signal H.
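The detection carried out by the means DBS2 can be sketched in software as follows. This is a minimal sketch under stated assumptions: each real or imaginary part is passed as an unsigned n-bit pattern, the 9-bit word length is taken from the FIG. 5 example (one sign bit, two duplicated sign bits, six significant bits), and the names `duplicated_sign_bits` and `global_dynamic_value` are hypothetical.

```python
def duplicated_sign_bits(pattern, n_bits, max_check=3):
    """Count the bits immediately below the sign bit that are equal to it."""
    sign = (pattern >> (n_bits - 1)) & 1
    count = 0
    # examine at most max_check bits adjacent to the most significant bit
    for pos in range(n_bits - 2, n_bits - 2 - min(max_check, n_bits - 1), -1):
        if (pattern >> pos) & 1 != sign:
            break
        count += 1
    return count

def global_dynamic_value(parts, n_bits, max_check=3):
    """Smallest number of duplicated sign bits over all parts of a block."""
    return min(duplicated_sign_bits(p, n_bits, max_check) for p in parts)

# FIG. 5 style word S S S BT1..BT6: two duplicated sign bits
print(duplicated_sign_bits(0b000101101, 9))                           # 2
print(global_dynamic_value([0b000101101, 0b111101101, 0b001000000], 9))  # 1
```

Taking the minimum over the block guarantees that shifting every word of the block by the global dynamic value never pushes a significant bit past the sign position.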

Means for rejustifying data supplied by the time-delay means MRA2 are provided between the output of these time-delay means and the input of the set EAS2 of adders/subtractors of the processing means of the stage ET2. These data rejustification means include in this example a shifter DL2 adapted to shift all the data of a block towards the left, i.e. towards the most significant bit, by an amount equal to the number stored in the register RGb2. Accordingly, as shown in FIG. 6, the shifted data word now has as the most significant bit the sign bit S followed by six significant bits BT1 through BT6. The last two bits of this word, which before shifting have the values BT5 and BT6, now have the value 0. The person skilled in the art will readily understand that this left shifting of the significant bits of the data word allowing for the previously calculated global dynamic value preserves acceptable accuracy of the data whilst retaining representation on n bits.
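The effect of the shifter DL2 on a single word can be illustrated as below. The sketch assumes unsigned n-bit patterns and the 9-bit word of FIGS. 5 and 6; the helper names `rejustify_left` and `to_signed` are hypothetical.

```python
def to_signed(pattern, n_bits):
    """Interpret an unsigned n-bit pattern as a 2's complement integer."""
    return pattern - (1 << n_bits) if pattern >> (n_bits - 1) else pattern

def rejustify_left(pattern, shift, n_bits):
    """Shift towards the most significant bit, keeping n bits and filling with 0."""
    return (pattern << shift) & ((1 << n_bits) - 1)

before = 0b000101101                     # S S S BT1..BT6
after = rejustify_left(before, 2, 9)     # S BT1..BT6 0 0
print(format(after, "09b"))              # 010110100
```

As long as the shift does not exceed the number of duplicated sign bits, the sign is preserved and the value is simply multiplied by a power of two: the pattern `0b111101101` (that is, -19) shifted by 2 gives -76.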

The output of the means DBS2 is connected to an adder A2 the other input of which is connected to a data transmission bus BS1 and the output of which is connected to another register RGa2 also controlled by the clock signal H2. The output of the register RGa2 is connected to another part BS2 of the data transmission bus. The function of these means is described in more detail below.

The operation of the device of the invention will now be described in detail with reference to the timing diagram shown in FIG. 7.

For simplicity, FIG. 7 is based on the assumption that the calculations carried out by the adders/subtractors and the complex multipliers, the detection of the number of duplicated sign bits and the addition in the various adders A1, A2, A3 (see FIG. 2 in particular) are carried out in a single clock cycle of the basic clock H.

The clock signal H2 of the stage ET2, the rising edges of which are synchronized with the starts of the data blocks from the multiplier MC1, has a frequency equal to one eighth the frequency of the basic clock signal H.

The data X0-X7 forms the first data block B1 from the processing means of the stage ET1. The number of duplicated sign bits of each of these data words is detected in the means DBS2 and the global dynamic value E1 of this block, i.e. the smallest number of duplicated sign bits, is stored in the register RGb2 on the next rising edge of the clock H2. As and when the data of the block is supplied by the multiplier MC1, it is stored in the memory means MRB2 and MRA2 and then output sequentially in a predetermined order in groups of four at the four outputs S21, S22, S23 and S24. However, given the nature of the time-delay means MRA2 and MRB2 and their memory capacity, the first group of data X0, X6, X4, X2 is present at the outputs S21, S22, S23 and S24 of the time-delay means (and thus ready to be processed by the processing means of the stage ET2) only after all the data of the block X0-X7 has been supplied by the multiplier MC1.

In other words, in a succession of Fourier transform stages, the start of the next processing (i.e. the processing in the stage ET2 in this example) of the data of a block from the current processing (stage ET1) is delayed by a number of basic clock cycles at least equal to r_(t) N_(t), starting from acquisition of the first data of the block output by the multiplier MC1.

The eight successive groups G1-G8 of four data words at the output of the time-delay means MRA2 are then shifted to the left by the value E1 in shifter DL2 before they are forwarded to the set EAS2 of adders/subtractors of the processing means of the stage ET2.

The person skilled in the art will realize that the time-delay element MRB2 is indispensable. In the absence of this element some of the data from the multiplier MC1 would have been present at the input of the set EAS2 before all of the data of the block B1 had been supplied by the multiplier MC1. It would therefore have been impossible to rejustify the first group G1 of four data words supplied by the time-delay means MRA2 and MRB2 to the set EAS2.

The same operations are carried out for the second data block B2 from the multiplier MC1. The global dynamic value E2 is also stored in the register RGb2 so that the data of this block can be shifted before it is processed in the set EAS2.

Note that although in this example the shifter DL2 is on the input side of the set EAS2 of adders/subtractors, it is advantageous to shift the data between the output of this set EAS2 and the input of the multiplier MC2 because this simplifies the structure of the shifter.

There are therefore obtained at the output of the means DBS2 four global dynamic values E1, E2, E3 and E4 respectively associated with the four blocks B1, B2, B3 and B4. These four values are added in the adder A2 to the input data available on the bus BS1. In this example this input value is equal to 0 since no dynamic value has been calculated for the block of 32 input data words. Consequently, the four global dynamic values E1-E4 are stored in the register RGa2.

At the output of the multiplier MC2 the data Yi is divided into 16 blocks B5-B20 each of two data words, yielding 16 global dynamic values E5-E20 used to rejustify the data, in a manner similar to that previously explained, before the data is processed in the processing means of the stage ET3. The data blocks B5 through B8 are obtained from the data block B1, the data blocks B9 through B12 from the data block B2, the data blocks B13 through B16 from the data block B3 and the data blocks B17 through B20 from the data block B4. The 16 values E5-E20 are supplied to the register RGa3 of the stage ET3 (FIG. 2). The first four values in this register are respectively equal to the global dynamic value E1 of the block B1 incremented by the four global dynamic values E5-E8 associated with the blocks B5-B8 obtained from the block B1; the other values are respectively equal to the other global dynamic values E2 through E4 incremented by the global dynamic values of the blocks obtained from the three other blocks B2-B4. The size (number of words) of the register RGa3 is the same as the size of the register RGa2. However, as it is timed by the clock signal H3 which is four times faster than the clock signal H2, it can store four times as many values.
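The accumulation performed by the adders and the registers RGa2 and RGa3 amounts to adding, for each block, its own global dynamic value to that of the block from which it was obtained. The sketch below illustrates this arithmetic only, not the register timing; the function name `accumulate_dynamic_values` and the numerical values chosen for E1-E20 are hypothetical.

```python
def accumulate_dynamic_values(per_stage):
    """Sum each block's global dynamic value with those of its ancestor blocks."""
    totals = [0]                         # no value is computed for the input block
    for values in per_stage:
        fanout = len(values) // len(totals)   # child blocks per parent block
        totals = [totals[i // fanout] + e for i, e in enumerate(values)]
    return totals

e1_to_e4 = [2, 0, 1, 3]                                          # illustrative values
e5_to_e20 = [1, 0, 0, 2, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1]     # illustrative values
print(accumulate_dynamic_values([e1_to_e4, e5_to_e20]))
# [3, 2, 2, 4, 0, 0, 0, 0, 2, 2, 2, 2, 3, 4, 3, 4]
```

Each entry of the result is the total number of bits by which the data of the corresponding final block has been shifted to the left.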

There is therefore obtained at the output of the device D a final dynamic value for each output data word from which the total number of bits by which the data word has been shifted can be determined. In the example described each pair of output data words supplied by the processing means of the last stage is associated with the same final dynamic value.

More generally, although two detections and shifts have been described above to make it easier to understand how the device operates, in practice only one detection and one shift are carried out for the intermediate processing stages. In the case of a mixed root 32 point Fourier transform (4, 4, 2), only the second stage will include detection and shifting. With respect to the final dynamic values, if shifting and detection are applied in the final stage a final dynamic value will be associated with a block whose length is equal to the root of that stage. On the other hand, if there is no processing in this stage a final dynamic value will be associated with a block of a greater number of values (16 values if the last two stages have the root 4 and eight values for roots respectively equal to 4 and 2).

For the output data obtained to be correct, final rejustification of the data is required at the output of the device, by shifting to the right, i.e. towards the least significant bit, by a number of bits equal to the final dynamic value associated with each data word. Dedicated shift means MRF can be used for this (FIG. 2). However, these means are not indispensable if a "floating" representation is adopted, but in this case the device must have an auxiliary output supplying the various final dynamic values in association with each of the output data words, so that this information can be acted on subsequently.
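The final rejustification can be sketched as follows. The function name and the sample values are hypothetical, and the sketch assumes signed integer words on which a right shift truncates towards minus infinity (the behaviour of an arithmetic shift on two's complement data); whether the shift means MRF truncate or round is not stated in the description. For complex data the same shift would be applied to the real and imaginary parts.

```python
def final_rejustify(words, final_dynamic_values):
    """Shift each output word right by its associated final dynamic value."""
    if len(words) != len(final_dynamic_values):
        raise ValueError("one final dynamic value is required per output word")
    # Python's >> on int is an arithmetic shift, so the sign is preserved.
    return [word >> shift for word, shift in zip(words, final_dynamic_values)]


# Hypothetical output words; each pair shares the same final dynamic value.
output_words = [96, -64, 40, -5]
final_values = [3, 3, 1, 1]

print(final_rejustify(output_words, final_values))
```

With a "floating" representation this step would be skipped and the pairs (word, final dynamic value) would be passed on unchanged for subsequent use.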

The person skilled in the art will realize that the invention enables working with a constant dynamic and minimizes the size of the internal data paths of the circuits, limiting this size to n bits, without excessive loss of precision in respect of the intermediate data. This enables the implementation of integrated circuits capable of processing Fourier transforms with 8,192 complex points in 1 ms using submicron CMOS technology, suitable for applications in terrestrial digital television, without any unnecessary and undesirable increase in the surface area of the circuit.
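The block rejustification on which this result relies (detecting the smallest number of duplicated sign bits in a block, then shifting every word of the block towards the most significant bit by that number) can be sketched as follows. This is an illustrative software model, not the circuit itself: the function names, the use of real integer words in n-bit two's complement and the sample block are assumptions.

```python
def duplicated_sign_bits(word, n_bits):
    """Count the redundant copies of the sign bit in an n-bit two's complement word.

    This is the largest left shift that keeps the word within n bits.
    """
    low, high = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not low <= word <= high:
        raise ValueError("word does not fit in n_bits")
    # For negative words, ~word is non-negative and has the same number of
    # significant bits below the sign bit.
    magnitude = word if word >= 0 else ~word
    return (n_bits - 1) - magnitude.bit_length()


def rejustify_block(block, n_bits):
    """Return the block shifted left by its global dynamic value, and that value."""
    global_dynamic_value = min(duplicated_sign_bits(w, n_bits) for w in block)
    shifted = [w << global_dynamic_value for w in block]
    return shifted, global_dynamic_value


# Hypothetical block of four 8-bit words.
shifted_block, global_value = rejustify_block([3, -4, 10, 1], 8)
print(shifted_block, global_value)
```

Because the shift is the minimum over the block, the word with the fewest duplicated sign bits exactly fills the n-bit path and no word overflows, which is why the internal data paths need not be wider than n bits.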

We claim:
 1. Electronic device for calculating a Fourier transform of a predetermined initial size, comprising a plurality (t) of successive processing stages where t is an integer, connected in series between an input and an output of the device by internal data paths and comprising respective processing means adapted to carry out Fourier transform processing of predetermined sizes (r_(t)) smaller than the initial size on blocks of data of successively reduced size from one said processing stage to a next, comprising: means for determining a global dynamic value for each said data block supplied by the processing means of a previous processing stage from dynamic values of all the data of each said data block; time-delay means for delaying supply of the data of each said data block to the processing means of a current processing stage at least until all the data of each said data block has been supplied by the processing means of the previous processing stage; intermediate rejustification means for rejustifying the data of each said data block allowing for the corresponding global dynamic value and for supplying the rejustified data to the processing means of the current processing stage; and means for determining final dynamic values associated with each output data at the output of the electronic device, supplied by the processing means of a last processing stage and obtained from the global dynamic values successively calculated in such a way to minimize the size of the internal data paths of the device.
 2. Device according to claim 1 wherein input data at the input of the device is received sequentially at an input frequency determined by a basic clock signal and the processing means of the t-th processing stage are adapted to carry out Fourier transform processing of size r_(t) on successive data blocks at the frequency of the basic clock signal; the time-delay means include first selective time-delay means adapted to memorize said data blocks supplied by the processing means of the previous processing stage and to supply to the processing means of the current processing stage at the frequency of the basic clock signal and with a predetermined time-delay, for each said data block received, successive groups of r_(t) data words in a predetermined order, and second time-delay means connected to the first time-delay means and also timed by the basic clock signal; and the first and second time-delay means have a combined memory capacity at least equal to the number of data words of each said data block from the processing means of the previous processing stage.
 3. Device according to claim 2 wherein the means for determining the final dynamic values include a succession of registers of the same size timed by respective clock signals of increasing frequency from one said processing stage to the next and are respectively connected to the means for determining the global dynamic values of the corresponding processing stage and to a preceding register via an adder.
 4. Device according to claim 2 wherein the first selective time-delay means of the t-th processing stage have r_(t) outputs connected to the processing means of the t-th processing stage and two sets of r_(t)-1 time-delay elements connected in series; the output of the final time-delay element of the first set is connected directly to one output of the r_(t) outputs of the first time-delay means; the outputs of the time-delay elements of the second set are respectively connected to the other outputs of the first time-delay means via one input of selective switching means having two inputs; the inputs of the time-delay elements of the first set are respectively connected to the other inputs of the two-input switching means; the second time-delay means include a time-delay element connected to the input of the first selective time-delay means; and all the time-delay elements have the same memory size.
 5. Device according to claim 4 wherein the time-delay elements include dynamic delay lines.
 6. Device according to claim 5 wherein the means for determining the global dynamic value of each said data block include means for determining the number of duplicated sign bits of each data word of the data block, said global dynamic value associated with the data block being the smallest of the numbers of duplicated sign bits.
 7. Device according to claim 1 wherein the means for determining the global dynamic value of each said data block include means for determining the number of duplicated sign bits of each data word of the data block, said global dynamic value associated with the data block being the smallest of the numbers of duplicated sign bits.
 8. Device according to claim 1 wherein the intermediate rejustification means include means for shifting the bits of each data word of the data block towards the most significant bit, said shift means being connected to the means for determining the global dynamic value.
 9. Device according to claim 8 wherein the processing means of each said processing stage include a set of adders/subtractors followed by a multiplier and wherein the shift means are on an input side of the set of adders/subtractors or between said set and the multiplier.
 10. Device according to claim 1 wherein the time-delay means are between the processing means of each pair of consecutive processing stages.
 11. Device according to claim 1 wherein the means for determining the global dynamic values are connected to the output of the processing means of each said processing stage.
 12. Device according to claim 1 wherein the intermediate rejustification means include means for shifting the bits of each data word of the data block towards the most significant bit, said shift means being connected to the means for determining the global dynamic value.
 13. Method of minimizing the size of internal data paths of a device for calculating a Fourier transform of predetermined initial size using a series or pipeline architecture in which device a succession of Fourier transform processing operations of predetermined sizes smaller than the initial size are carried out on blocks of data of successively reducing size from one said processing operation to a next, comprising the steps of: determining a global dynamic value for each said data block from a previous processing operation from dynamic values of all the data of the data block; rejustifying the data of each said data block allowing for said global dynamic value before a current processing operation is carried out on the data; and supplying the rejustified data to the current processing operation.
 14. Method according to claim 13 wherein input data at the input of the device is supplied at a timing rate of a predetermined basic clock (4) and each said processing operation is timed by said basic clock, further comprising the step of: delaying the start of the current processing operation on the data of each said data block from the previous processing operation by at least a number of basic clock cycles equal to the number of data words of the data block, after acquisition of the first data of the data block from the previous processing operation.
 15. Method according to claim 13, wherein the step of determining the global dynamic value of each said data block comprises the step of detecting the number of sign bits of all the data words of the data block, the global dynamic value being the smallest of the numbers of duplicated sign bits of said data.
 16. Method according to claim 15, wherein the step of rejustifying the data of the data block comprises the step of shifting all the bits of each said data word towards the most significant bit by a number of bits equal to the smallest of the numbers of duplicated sign bits.
 17. Method according to claim 16, further comprising the step of: incrementing the global dynamic value of each said data block which is the subject of the processing operation by the global dynamic value of each said data block obtained from the data block after said processing operation to obtain, at the end of processing, a final dynamic value for each data word obtained at an output of the device.
 18. Method of minimizing the size of internal data paths of a device for calculating a Fourier transform of predetermined initial size using a series or pipeline architecture in which device a succession of Fourier transform processing operations of predetermined sizes smaller than the initial size are carried out on blocks of data of successively reducing size from one said processing operation to a next, wherein input data at an input of the device is supplied at a timing rate of a predetermined basic clock (4), and each said processing operation is timed by said basic clock, comprising the steps of: determining a global dynamic value for each said data block from a previous processing operation from dynamic values of all the data of the data block; delaying the start of a current processing operation on the data of each said data block from the previous processing operation by at least a number of basic clock cycles equal to the number of data words of the data block, after acquisition of the first data of the data block from the previous processing operation; rejustifying the data of each said data block allowing for said global dynamic value before the current processing operation is carried out on the data; supplying the rejustified data to the current processing operation; and incrementing each said global dynamic value of the data block which is the subject of the processing operation by the global dynamic value of the data block obtained from the data block after said processing operation to obtain, at the end of processing, a final dynamic value for each output data word at an output of the device.
 19. Method according to claim 18, wherein the step of determining the global dynamic value of each said data block comprises a step of detecting the number of sign bits of all the data words of the data block, the global dynamic value being the smallest of the numbers of duplicated sign bits of said data.
 20. Method according to claim 19, wherein the step of rejustifying the data of each said data block comprises a step of shifting all the bits of each data word towards the most significant bit by a number of bits equal to the smallest of the numbers of duplicated sign bits. 